Filtering Degenerate Patterns with Application to Protein Sequence Analysis
Abstract
In biology, the notion of a degenerate pattern plays a central role in describing various phenomena. For example, protein active site patterns, like those contained in the PROSITE database, e.g., [FY]DPC[LIM][ASG]C[ASG], are, in general, represented by degenerate patterns with character classes. Researchers have developed several approaches over the years to discover degenerate patterns. Although these methods have been exhaustively and successfully tested on genomes and proteins, their output often far exceeds the size of the original input, making it hard to manage and to interpret in refined analyses that require manual inspection. In this paper, we discuss a characterization of degenerate patterns with character classes, without gaps, and we introduce the concept of pattern priority for comparing and ranking different patterns. We define the class of underlying patterns for filtering any set of degenerate patterns into a new set that is linear in the size of the input sequence. We present some preliminary results on the detection of subtle signals in protein families. The results show that our approach drastically reduces the number of patterns output by a tool for protein analysis, while retaining the representative patterns.
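To make the notion of a gapless degenerate pattern with character classes concrete, the following minimal Python sketch parses a PROSITE-style string such as the one quoted in the abstract and reports where it occurs in a sequence. The function names `parse_pattern` and `find_occurrences` and the example sequence are hypothetical, chosen only for illustration. The sketch covers pattern matching only; it does not implement the pattern priority or underlying-pattern filtering that the paper proposes.

```python
import re


def parse_pattern(pattern):
    """Split a gapless degenerate pattern into one character class per position.

    A bracketed group such as [FY] becomes the set {'F', 'Y'}; a bare letter
    such as D becomes the singleton set {'D'}. (Hypothetical helper.)
    """
    classes = []
    for match in re.finditer(r"\[([A-Z]+)\]|([A-Z])", pattern):
        classes.append(frozenset(match.group(1) or match.group(2)))
    return classes


def find_occurrences(pattern, sequence):
    """Return every start index where each pattern position accepts the
    corresponding sequence character (naive scan, no gaps allowed)."""
    classes = parse_pattern(pattern)
    m = len(classes)
    return [
        i
        for i in range(len(sequence) - m + 1)
        if all(sequence[i + j] in classes[j] for j in range(m))
    ]


# Pattern taken from the abstract; the sequence is made up for the example.
positions = find_occurrences("[FY]DPC[LIM][ASG]C[ASG]", "MKFDPCLACSAA")
print(positions)
```

The naive scan takes time proportional to the sequence length times the pattern length, which is enough to show what an occurrence of a degenerate pattern means.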
Similar resources
Designing Of Degenerate Primers-Based Polymerase Chain Reaction (PCR) For Amplification Of WD40 Repeat-Containing Proteins Using Local Allignment Search Method
Degenerate primers-based polymerase chain reaction (PCR) is commonly used for isolation of unidentified gene sequences in related organisms. For designing the degenerate primers, we propose the use of a local alignment search method to find conserved regions long enough to design an acceptable primer pair. To test this method, a WD40 repeat-containing domain protein from Beauveria bass...
Quantum current modelling on tri-layer graphene nanoribbons in limit degenerate and non-degenerate
Graphene is characterized by excellent carrier transport properties and high sensitivity at the surface of a single molecule, making it well suited to nanoelectronic applications. TGN is modeled in the form of three honeycomb lattices with pairs of inequivalent sites {A1, B1}, {A2, B2}, and {A3, B3}, which are located in the top, center and bottom layers, respectively. Trilayer...
Isolation of the Gene Coding for Movement Protein from Grapevine Fanleaf Virus
A pair of degenerate primers, GMPF1 and GMPR1, was designed on the basis of alignment of previously reported Grapevine fanleaf virus (GFLV) movement protein (MP) nucleotide sequences from Iran and other parts of the world. cDNA was synthesized by the use of Oligo d(T)18 from total RNA extraction from each diseased grapevine leaf sample and subjected to polymerase chain reaction (PCR) with the d...
Application of Single-Frequency Time-Space Filtering Technique for Seismic Ground Roll and Random Noise Attenuation
Time-frequency filtering is an acceptable technique for attenuating noise in 2-D (time-space) and 3-D (time-space-space) reflection seismic data. The common approach for this purpose is transforming each seismic signal from 1-D time domain to a 2-D time-frequency domain and then denoising the signal by a designed filter and finally transforming back the filtered signal to original time domain. ...
iProsite: an improved prosite database achieved by replacing ambiguous positions with more informative representations
PROSITE database contains a set of entries corresponding to protein families, which are used to identify the family of a protein from its sequence. Although patterns and profiles are developed to be very selective, each may have false positive or negative hits. Considering false positives as items that reduce the selectiveness of a pattern, then, the more selective pattern we have, a more accur...
Journal: Algorithms
Volume: 6
Publication year: 2013